
perf: implement NextGEQ #345

Open
cheb0 wants to merge 32 commits into main from 332-next-geq

Conversation

Member

@cheb0 cheb0 commented Feb 6, 2026

Description

This PR adds NextGeq, a new operation that boosts search and aggregation performance. It is implemented on top of the faster cmp branch, because fast comparisons are vital for this operation. It is also the basis for future improvements such as skipping disk reads.

Currently, NextGeq works more like a hint: some nodes do not support it, and for those nodes NextGeq behaves the same as Next. Therefore, some callers still loop even when calling NextGeq (i.e. to scroll past lower values), as in aggregator.go. This might be improved in the future.
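A minimal sketch of what "NextGeq as a hint" means for a caller. The `iterator` interface, `plainIter` and `seekGeq` are hypothetical names made up for illustration; the real nodes work with the `LID` type and a null sentinel rather than an `(id, ok)` pair.

```go
package main

import "fmt"

// iterator is a hypothetical, simplified stand-in for a query-tree node.
// Values come out in ascending order; ok is false once the node is exhausted.
type iterator interface {
	Next() (id uint32, ok bool)
	// NextGeq is only a hint: a node that cannot skip may return
	// a value below target.
	NextGeq(target uint32) (id uint32, ok bool)
}

// plainIter does not support skipping, so its NextGeq degrades to Next.
type plainIter struct {
	data []uint32
	pos  int
}

func (p *plainIter) Next() (uint32, bool) {
	if p.pos >= len(p.data) {
		return 0, false
	}
	v := p.data[p.pos]
	p.pos++
	return v, true
}

func (p *plainIter) NextGeq(uint32) (uint32, bool) { return p.Next() }

// seekGeq shows the loop a caller still needs while NextGeq is a hint:
// keep calling until the returned value actually reaches target.
func seekGeq(it iterator, target uint32) (uint32, bool) {
	id, ok := it.NextGeq(target)
	for ok && id < target {
		id, ok = it.NextGeq(target)
	}
	return id, ok
}

func main() {
	it := &plainIter{data: []uint32{1, 4, 7, 9, 12}}
	id, ok := seekGeq(it, 8)
	fmt.Println(id, ok)
}
```

Once every node honours the target, the loop in `seekGeq` becomes dead code and can be dropped.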

Measurements

The performance improvement depends on the particular operation and search request (and even on the tokens). If we can skip a lot, the improvement is large.

For example, service:X AND level:3 speeds up by 60%, while the same query with level:4 as the second predicate speeds up by only 6%. This is because level:3 reduces the total number of results by a lot, while almost all service logs have level:4.
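To illustrate why selectivity matters, here is a rough sketch that intersects two ascending ID lists and counts how many seeks were needed. It works on plain slices with made-up data; `seekGeq` and `intersect` are illustrative names, not functions from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// seekGeq returns the first index i >= from with s[i] >= target (len(s) if none).
func seekGeq(s []uint32, from int, target uint32) int {
	return from + sort.Search(len(s)-from, func(i int) bool { return s[from+i] >= target })
}

// intersect returns the common elements of two ascending slices
// and the number of seek calls it made.
func intersect(a, b []uint32) (out []uint32, seeks int) {
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]:
			out = append(out, a[i])
			i++
			j++
		case a[i] < b[j]:
			i = seekGeq(a, i, b[j]) // jump over everything below b[j]
			seeks++
		default:
			j = seekGeq(b, j, a[i])
			seeks++
		}
	}
	return out, seeks
}

func main() {
	var all, sparse, dense []uint32
	for v := uint32(0); v < 10000; v++ {
		all = append(all, v)
		if v%1000 == 0 {
			sparse = append(sparse, v) // selective predicate
		}
		if v%10 != 0 {
			dense = append(dense, v) // predicate matching almost everything
		}
	}
	out, seeks := intersect(all, sparse)
	fmt.Println("sparse:", len(out), "matches,", seeks, "seeks")
	out, seeks = intersect(all, dense)
	fmt.Println("dense:", len(out), "matches,", seeks, "seeks")
}
```

With the selective list a handful of seeks replaces thousands of single steps; with the dense list almost every element has to be visited anyway, so there is little to gain.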

The performance improvement on a typical user aggregation is good, since we skip a lot in the agg tree:
service:xyz | group by k8s_pod count(*) (prod fracs)

master: 370-456 ms (hot-cold)
fast cmp: 320-402 ms
next geq: 73-153 ms (-50-70%)

There are also downsides: it's not zero-cost. For example, consider the case when we iterate the whole tree and there is nothing to skip (i.e. we traverse exactly the same paths, but there is an extra comparison in the NextGeq operation):
exists:service | group by k8s_pod count(*)
master: 690-821 ms
fast cmp: 600-730 ms
next geq: 662-800 ms (+10% skipping overhead)

However, thanks to faster cmp, it is still faster than the current main branch, i.e. there is no performance regression. The next step is probably to implement some sort of cost-based planning (with cardinality estimation), where we can evaluate whether there is any skipping potential and disable skipping altogether when there is none, or maybe even avoid constructing an aggregation tree.

This PR partially addresses #332


  • I have read and followed all requirements in CONTRIBUTING.md;
  • I used LLM/AI assistance to make this pull request;


codecov-commenter commented Feb 7, 2026

Codecov Report

❌ Patch coverage is 96.44970% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.77%. Comparing base (8e3ecad) to head (8c9819e).

Files with missing lines Patch % Lines
node/node_or.go 96.29% 1 Missing and 1 partial ⚠️
node/node_range.go 0.00% 2 Missing ⚠️
frac/sealed/lids/iterator_asc.go 94.11% 1 Missing ⚠️
frac/sealed/lids/iterator_desc.go 94.44% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #345      +/-   ##
==========================================
+ Coverage   71.52%   71.77%   +0.25%     
==========================================
  Files         204      205       +1     
  Lines       14812    14978     +166     
==========================================
+ Hits        10594    10751     +157     
- Misses       3454     3459       +5     
- Partials      764      768       +4     


@cheb0 cheb0 changed the title from "perf: Next GEQ" to "perf: implement NextGEQ" Feb 9, 2026
@dkharms dkharms added the performance label (Features or improvements that positively affect seq-db performance) Feb 10, 2026
@github-actions
Contributor

🔴 Performance Degradation

Some benchmarks have degraded compared to the previous run.
Click on Show table button to see full list of degraded benchmarks.

Name                          Previous commit  Current commit  Metric     Previous  Current  Ratio  Verdict
And/size=1000000-4            8e3eca           f18094          ns/op      4.58      5.12     1.12   🔴
OrTreeNextGeq/size=1000-4     ------           f18094          B/op       NaN       0.00     NaN    🔴
OrTreeNextGeq/size=1000-4     ------           f18094          allocs/op  NaN       0.00     NaN    🔴
OrTreeNextGeq/size=1000-4     ------           f18094          ns/op      NaN       4.79     NaN    🔴
OrTreeNextGeq/size=10000-4    ------           f18094          B/op       NaN       0.00     NaN    🔴
OrTreeNextGeq/size=10000-4    ------           f18094          allocs/op  NaN       0.00     NaN    🔴
OrTreeNextGeq/size=10000-4    ------           f18094          ns/op      NaN       4.89     NaN    🔴
OrTreeNextGeq/size=1000000-4  ------           f18094          B/op       NaN       0.00     NaN    🔴
OrTreeNextGeq/size=1000000-4  ------           f18094          allocs/op  NaN       0.00     NaN    🔴
OrTreeNextGeq/size=1000000-4  ------           f18094          ns/op      NaN       5.36     NaN    🔴

@seqbenchbot

Nice, @cheb0 <(-^,^-)=b!

The benchmark with identifier e3ad695a has finished.
I've prepared a summary for you. Click on Show summary button to see it:

Columns: Query, Type, then base / comp / diff for each of: mean (ms), stddev (ms), p(50) (ms), p(95) (ms), p(99) (ms), iterations.
service:api-gateway-us
 | group by (method) | avg(size)
cold 1051.60 900.00 -14.42% 86.29 43.08 -50.07% 1025.00 886.00 -13.56% 1135.00 941.00 -17.09% 1135.00 941.00 -17.09% 5.00 5.00 0.00%
service:api-gateway-us
 | group by (method) | avg(size)
warm 304.16 215.56 -29.13% 25.15 25.92 +3.05% 293.50 214.50 -26.92% 349.50 253.50 -27.47% 367.50 261.00 -28.98% 25.00 25.00 0.00%
service:payment-backend-eu
 | group by (k8s_pod) | min(level)
cold 1094.40 772.60 -29.40% 104.23 79.66 -23.57% 1046.50 747.50 -28.57% 1179.50 840.50 -28.74% 1179.50 840.50 -28.74% 5.00 5.00 0.00%
service:payment-backend-eu
 | group by (k8s_pod) | min(level)
warm 360.52 112.92 -68.68% 22.93 17.05 -25.64% 355.50 113.50 -68.07% 395.50 140.00 -64.60% 401.00 142.00 -64.59% 25.00 25.00 0.00%
service:payment-backend-eu
 | group by (k8s_pod, 10s) | min(level)
cold 2996.20 2802.20 -6.47% 472.34 274.97 -41.78% 2803.00 2667.00 -4.85% 3375.00 3052.50 -9.56% 3375.00 3052.50 -9.56% 5.00 5.00 0.00%
service:payment-backend-eu
 | group by (k8s_pod, 10s) | min(level)
warm 407.08 147.20 -63.84% 32.77 20.41 -37.71% 401.00 137.50 -65.71% 455.50 185.00 -59.39% 467.00 194.50 -58.35% 25.00 25.00 0.00%
service:payment-backend-eu
 | group by (k8s_pod) | count
cold 890.60 719.40 -19.22% 112.98 95.05 -15.87% 836.50 721.00 -13.81% 983.50 795.00 -19.17% 983.50 795.00 -19.17% 5.00 5.00 0.00%
service:payment-backend-eu
 | group by (k8s_pod) | count
warm 272.36 102.24 -62.46% 27.20 10.37 -61.87% 271.50 103.50 -61.88% 311.50 115.00 -63.08% 324.00 115.50 -64.35% 25.00 25.00 0.00%
service:*
 | group by (10s) | count
cold 2576.60 2823.80 +9.59% 131.84 593.43 +350.12% 2566.50 2521.50 -1.75% 2679.50 3378.00 +26.07% 2679.50 3378.00 +26.07% 5.00 5.00 0.00%
service:*
 | group by (10s) | count
warm 304.00 317.24 +4.36% 31.64 38.19 +20.68% 293.50 305.00 +3.92% 344.00 390.50 +13.52% 378.00 404.00 +6.88% 25.00 25.00 0.00%
k8s_namespace:prod
AND level:[0 to 3]
 | group by (15s) | count
cold 2586.00 2354.80 -8.94% 184.65 386.42 +109.28% 2587.00 2144.00 -17.12% 2733.50 2695.00 -1.41% 2733.50 2695.00 -1.41% 5.00 5.00 0.00%
k8s_namespace:prod
AND level:[0 to 3]
 | group by (15s) | count
warm 78.16 58.24 -25.49% 16.78 8.34 -50.29% 81.50 61.00 -25.15% 96.50 66.50 -31.09% 97.50 67.50 -30.77% 25.00 25.00 0.00%
service:payment-backend-eu
 | group by (1s) | count
cold 2211.20 2348.60 +6.21% 205.77 521.84 +153.60% 2165.50 2179.00 +0.62% 2390.50 2761.00 +15.50% 2390.50 2761.00 +15.50% 5.00 5.00 0.00%
service:payment-backend-eu
 | group by (1s) | count
warm 60.96 66.12 +8.46% 13.46 11.83 -12.10% 67.00 68.50 +2.24% 76.00 77.50 +1.97% 79.00 82.00 +3.80% 25.00 25.00 0.00%
service:* | group by (k8s_pod) | count
cold 1458.00 1470.60 +0.86% 66.16 74.87 +13.17% 1436.00 1484.00 +3.34% 1516.00 1520.50 +0.30% 1516.00 1520.50 +0.30% 5.00 5.00 0.00%
service:* | group by (k8s_pod) | count
warm 708.64 723.08 +2.04% 63.87 65.67 +2.83% 687.00 717.00 +4.37% 814.00 831.50 +2.15% 875.50 874.00 -0.17% 25.00 25.00 0.00%
service:payment-backend-us
cold 52.00 62.20 +19.62% 12.67 17.68 +39.58% 44.50 57.50 +29.21% 64.50 77.50 +20.16% 64.50 77.50 +20.16% 5.00 5.00 0.00%
service:payment-backend-us
warm 3.24 3.36 +3.70% 0.44 0.49 +12.39% 3.00 3.00 0.00% 4.00 4.00 0.00% 4.00 4.00 0.00% 25.00 25.00 0.00%
k8s_pod:payment-backend-us-*
AND transaction_id:'tx-needle99-0000'
cold 323.60 330.20 +2.04% 56.84 59.83 +5.27% 308.50 303.00 -1.78% 371.50 389.00 +4.71% 371.50 389.00 +4.71% 5.00 5.00 0.00%
k8s_pod:payment-backend-us-*
AND transaction_id:'tx-needle99-0000'
warm 2.56 2.48 -3.13% 0.96 0.59 -39.02% 2.00 2.00 0.00% 4.00 3.00 -25.00% 5.00 3.50 -30.00% 25.00 25.00 0.00%
service:payment-backend-eu
 | group by (k8s_pod, 10s) | count
cold 2815.00 2593.40 -7.87% 418.30 298.99 -28.52% 2707.00 2646.50 -2.23% 3201.00 2792.00 -12.78% 3201.00 2792.00 -12.78% 5.00 5.00 0.00%
service:payment-backend-eu
 | group by (k8s_pod, 10s) | count
warm 307.16 135.00 -56.05% 32.03 21.66 -32.37% 295.50 130.50 -55.84% 359.00 171.50 -52.23% 374.50 180.00 -51.94% 25.00 25.00 0.00%
service:api-gateway-us
 | group by (method) | avg(size)
cold 1076.00 900.00 -16.36% 72.53 43.08 -40.60% 1035.50 886.00 -14.44% 1152.50 941.00 -18.35% 1152.50 941.00 -18.35% 5.00 5.00 0.00%
service:api-gateway-us
 | group by (method) | avg(size)
warm 300.12 215.56 -28.18% 16.76 25.92 +54.62% 296.50 214.50 -27.66% 326.50 253.50 -22.36% 333.50 261.00 -21.74% 25.00 25.00 0.00%
service:payment-backend-eu
AND level:[0 to 3]
 | group by (10s) | count
cold 670.80 700.00 +4.35% 119.38 63.38 -46.90% 651.50 685.00 +5.14% 765.50 756.50 -1.18% 765.50 756.50 -1.18% 5.00 5.00 0.00%
service:payment-backend-eu
AND level:[0 to 3]
 | group by (10s) | count
warm 10.76 9.64 -10.41% 0.93 0.70 -24.37% 11.00 10.00 -9.09% 12.00 10.50 -12.50% 12.50 11.00 -12.00% 25.00 25.00 0.00%
k8s_pod:payment-backend-us-*
AND level:[0 to 3]
AND NOT message:'Health check failed'
AND NOT message:'slow query detected'
cold 632.40 580.80 -8.16% 108.02 47.02 -56.47% 572.50 568.00 -0.79% 745.00 620.50 -16.71% 745.00 620.50 -16.71% 5.00 5.00 0.00%
k8s_pod:payment-backend-us-*
AND level:[0 to 3]
AND NOT message:'Health check failed'
AND NOT message:'slow query detected'
warm 12.68 8.32 -34.38% 1.35 0.48 -64.61% 13.00 8.00 -38.46% 14.50 9.00 -37.93% 15.50 9.00 -41.94% 25.00 25.00 0.00%
k8s_pod:payment-backend-us-*
AND message:"this error doesn't exist, do not try to find it"
AND level:[0 to 3]
cold 585.20 683.80 +16.85% 58.84 190.78 +224.21% 574.50 579.50 +0.87% 634.50 866.50 +36.56% 634.50 866.50 +36.56% 5.00 5.00 0.00%
k8s_pod:payment-backend-us-*
AND message:"this error doesn't exist, do not try to find it"
AND level:[0 to 3]
warm 3.32 3.28 -1.20% 0.56 0.54 -2.73% 3.00 3.00 0.00% 4.00 4.00 0.00% 4.50 4.50 0.00% 25.00 25.00 0.00%
service:payment-backend-us
AND level:[0 to 3]
cold 384.80 392.80 +2.08% 84.23 37.68 -55.27% 348.50 374.50 +7.46% 448.50 432.50 -3.57% 448.50 432.50 -3.57% 5.00 5.00 0.00%
service:payment-backend-us
AND level:[0 to 3]
warm 6.56 6.20 -5.49% 1.19 0.87 -27.41% 6.00 6.00 0.00% 8.00 7.50 -6.25% 9.50 8.50 -10.53% 25.00 25.00 0.00%
k8s_pod:payment-backend-us-*
AND transaction_id:'*needle*'
cold 796.20 715.60 -10.12% 167.17 112.07 -32.96% 753.00 685.00 -9.03% 948.50 811.00 -14.50% 948.50 811.00 -14.50% 5.00 5.00 0.00%
k8s_pod:payment-backend-us-*
AND transaction_id:'*needle*'
warm 32.04 27.84 -13.11% 3.87 1.60 -58.66% 33.00 29.00 -12.12% 36.00 29.00 -19.44% 39.00 29.00 -25.64% 25.00 25.00 0.00%
k8s_pod:payment-backend-us-*
AND transaction_id:'tx-needle00-0000'
cold 472.00 485.00 +2.75% 61.40 87.22 +42.05% 445.00 427.50 -3.93% 519.00 580.00 +11.75% 519.00 580.00 +11.75% 5.00 5.00 0.00%
k8s_pod:payment-backend-us-*
AND transaction_id:'tx-needle00-0000'
warm 7.72 4.16 -46.11% 1.17 0.37 -68.11% 7.00 4.00 -42.86% 9.50 5.00 -47.37% 11.00 5.00 -54.55% 25.00 25.00 0.00%
service:api-gateway-us
AND status:404
AND resource:'/api/v1/audit/*'
 | group by (client_ip) | count
cold 2181.80 1879.00 -13.88% 121.65 127.71 +4.98% 2117.00 1833.00 -13.42% 2307.50 1997.00 -13.46% 2307.50 1997.00 -13.46% 5.00 5.00 0.00%
service:api-gateway-us
AND status:404
AND resource:'/api/v1/audit/*'
 | group by (client_ip) | count
warm 1236.08 828.24 -32.99% 28.29 25.29 -10.60% 1230.00 826.50 -32.80% 1272.50 865.50 -31.98% 1295.00 885.50 -31.62% 25.00 25.00 0.00%
k8s_pod:payment-* | group by (10s) | count
cold 2528.40 2387.40 -5.58% 306.92 122.27 -60.16% 2504.50 2363.50 -5.63% 2771.00 2500.00 -9.78% 2771.00 2500.00 -9.78% 5.00 5.00 0.00%
k8s_pod:payment-* | group by (10s) | count
warm 285.28 279.44 -2.05% 37.50 27.29 -27.23% 275.00 279.50 +1.64% 348.50 320.50 -8.03% 369.50 328.00 -11.23% 25.00 25.00 0.00%
service:api-gateway-us
AND method:'GET'
AND status:200 AND size:[990 TO *]
AND resource:'/assets/css/bootstrap.css'
cold 970.80 1054.60 +8.63% 142.64 143.49 +0.60% 888.00 1007.00 +13.40% 1119.50 1190.00 +6.30% 1119.50 1190.00 +6.30% 5.00 5.00 0.00%
service:api-gateway-us
AND method:'GET'
AND status:200 AND size:[990 TO *]
AND resource:'/assets/css/bootstrap.css'
warm 21.96 11.48 -47.72% 2.81 0.92 -67.27% 23.00 11.00 -52.17% 24.00 13.00 -45.83% 24.50 13.50 -44.90% 25.00 25.00 0.00%
service:payment-backend-us
AND NOT message:'Health check failed'
AND NOT message:'slow query detected'
AND NOT message:'Database query executed'
AND NOT message:'API rate limit'
AND NOT message:'Fraud detection'
AND NOT message:'Configuration reload completed'
AND NOT message:'SSL certificate'
cold 117.80 136.80 +16.13% 26.86 59.54 +121.64% 104.50 99.50 -4.78% 146.00 197.00 +34.93% 146.00 197.00 +34.93% 5.00 5.00 0.00%
service:payment-backend-us
AND NOT message:'Health check failed'
AND NOT message:'slow query detected'
AND NOT message:'Database query executed'
AND NOT message:'API rate limit'
AND NOT message:'Fraud detection'
AND NOT message:'Configuration reload completed'
AND NOT message:'SSL certificate'
warm 3.64 3.80 +4.40% 0.57 0.41 -28.20% 4.00 4.00 0.00% 4.00 4.00 0.00% 4.50 4.00 -11.11% 25.00 25.00 0.00%

Have a great time!

@cheb0 cheb0 marked this pull request as ready for review February 20, 2026 05:51
@eguguchkin eguguchkin added this to the v0.69.0 milestone Mar 2, 2026
@eguguchkin eguguchkin requested review from dkharms and eguguchkin March 2, 2026 11:04
continue
}

idx, found := util.GallopSearchLeq(it.lids, nextID.Unpack())
Member

Does it really perform better than an ordinary binary search?

As far as I can tell, the only additional contribution of this algorithm is calculating the upper bound for the binary search (or the lower bound in the leq case).

if vals[0] >= x {
return 0, true
}
hi := 1
Member

I guess here you can calculate low as well.

I am not sure, though, that this function is really useful, since we have pretty small LID blocks (64k LIDs per block, and IIRC you wanted to reduce this number to 4k). Or am I tripping?

func GallopSearchGeq(vals []uint32, x uint32) (idx int, found bool) {
	n := len(vals)
	if n == 0 {
		return 0, false
	}
	if vals[0] >= x {
		return 0, true
	}

	hi := 1
	for hi < n && vals[hi] < x {
		hi *= 2
	}

	end := min(n, hi+1)
	start := end / 2

	idx, found = slices.BinarySearch(vals[start:end], x)
	idx += start

	if idx >= end {
		return 0, false
	}

	return idx, true
}

Please be aware that I did not properly test it; it's just a raw idea (however, your test cases are green).

for {
for len(it.lids) == 0 {
if !it.tryNextBlock {
return node.NewLIDOrderDesc(math.MaxUint32)
Member

nit: Please do not forget to use node.NullLID

it.counter.AddLIDsCount(len(it.lids))
}

// fast path: smallest remaining > nextID => skip entire block
Member

I guess it's worth leaving a TODO here with something like:

We could also pass LID into narrowLIDsRange to perform block skipping once we add something like MinLID to LID block header

You've mentioned this in the PR description:

It's also a base for future improvements like skipping disk reads

It will eliminate the need for such a "fast" path.
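A rough sketch of the header-based skipping idea, assuming descending iteration and a hypothetical `blockHeader` that carries a `MinLID` field (no such field exists in the current block header; all names here are made up for illustration):

```go
package main

import "fmt"

// blockHeader is a hypothetical per-block summary.
// MinLID is the smallest LID stored in the block.
type blockHeader struct {
	MinLID uint32
}

// firstCandidateBlock walks headers in iteration order (descending LIDs)
// starting at from and returns the first block that may contain a LID <= target,
// together with the number of blocks skipped without reading their payload.
func firstCandidateBlock(headers []blockHeader, from int, target uint32) (idx, skipped int) {
	idx = from
	// A block whose smallest LID is above target cannot contain anything <= target.
	for idx < len(headers) && headers[idx].MinLID > target {
		idx++
	}
	return idx, idx - from
}

func main() {
	headers := []blockHeader{{900}, {600}, {300}, {0}}
	idx, skipped := firstCandidateBlock(headers, 0, 450)
	fmt.Println(idx, skipped)
}
```

The check is the same one the in-memory fast path performs, only moved in front of the block load, which is where the disk read would be saved.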

@@ -0,0 +1,49 @@
package util
Member

I've asked an LLM to generate benchmarks for different scenarios and block sizes. Here are the results:

GallopFront/4k-16       34.06n ± 2%
GallopFront/8k-16       38.97n ± 1%
GallopFront/16k-16      45.51n ± 2%
GallopFront/32k-16      49.22n ± 3%
GallopFront/64k-16      51.70n ± 4%
GallopFront/128k-16     57.81n ± 4%
BinaryFront/4k-16       34.71n ± 4%
BinaryFront/8k-16       39.49n ± 2%
BinaryFront/16k-16      42.93n ± 3%
BinaryFront/32k-16      46.55n ± 2%
BinaryFront/64k-16      47.56n ± 1%
BinaryFront/128k-16     52.17n ± 3%

GallopUniform/4k-16     49.33n ± 4%
GallopUniform/8k-16     51.72n ± 2%
GallopUniform/16k-16    58.49n ± 5%
GallopUniform/32k-16    68.36n ± 4%
GallopUniform/64k-16    76.25n ± 3%
GallopUniform/128k-16   85.15n ± 5%
BinaryUniform/4k-16     42.32n ± 2%
BinaryUniform/8k-16     47.48n ± 4%
BinaryUniform/16k-16    53.79n ± 3%
BinaryUniform/32k-16    60.30n ± 2%
BinaryUniform/64k-16    65.98n ± 2%
BinaryUniform/128k-16   73.05n ± 1%

GallopBack/4k-16        40.05n ± 1%
GallopBack/8k-16        44.22n ± 2%
GallopBack/16k-16       48.66n ± 1%
GallopBack/32k-16       55.97n ± 4%
GallopBack/64k-16       60.87n ± 6%
GallopBack/128k-16      66.48n ± 5%
BinaryBack/4k-16        34.15n ± 3%
BinaryBack/8k-16        38.01n ± 4%
BinaryBack/16k-16       39.21n ± 1%
BinaryBack/32k-16       44.06n ± 6%
BinaryBack/64k-16       50.46n ± 2%
BinaryBack/128k-16      55.85n ± 2%

It seems that ordinary binary search performs even better. We can discuss it.
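One way to reason about these numbers is to count element comparisons instead of nanoseconds. The sketch below is not the code from this PR: `binaryGeq` and `gallopGeq` are simplified illustrative versions that return the first index with a value >= x plus the number of probes, so it says nothing about cache effects or branch prediction.

```go
package main

import "fmt"

// binaryGeq returns the first index i with vals[i] >= x (len(vals) if none)
// and the number of element comparisons it made.
func binaryGeq(vals []uint32, x uint32) (idx, probes int) {
	lo, hi := 0, len(vals)
	for lo < hi {
		mid := lo + (hi-lo)/2
		probes++
		if vals[mid] < x {
			lo = mid + 1
		} else {
			hi = mid
		}
	}
	return lo, probes
}

// gallopGeq returns the same index, but first doubles an upper bound
// starting from the front and only then binary-searches the last gap.
func gallopGeq(vals []uint32, x uint32) (idx, probes int) {
	n := len(vals)
	if n == 0 {
		return 0, 0
	}
	probes++
	if vals[0] >= x {
		return 0, probes
	}
	lo, hi := 0, 1 // invariant: vals[lo] < x
	for hi < n {
		probes++
		if vals[hi] >= x {
			break
		}
		lo, hi = hi, hi*2
	}
	if hi > n {
		hi = n
	}
	idx, p := binaryGeq(vals[lo+1:hi], x)
	return lo + 1 + idx, probes + p
}

func main() {
	vals := make([]uint32, 4096)
	for i := range vals {
		vals[i] = uint32(2 * i)
	}
	for _, pos := range []int{3, 2000, 4000} {
		_, g := gallopGeq(vals, vals[pos])
		_, b := binaryGeq(vals, vals[pos])
		fmt.Printf("pos=%d gallop=%d binary=%d probes\n", pos, g, b)
	}
}
```

Galloping needs fewer probes only when the target sits close to the start of the slice; towards the back it pays for the doubling phase and then for a binary search on top of it.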

}
}

func NewLID(lid uint32, reverse bool) LID {
Member

I am strongly against using relative properties, honestly.

It's better to have desc bool or asc bool than reverse bool, since reverse requires understanding the context.

// NextGeq finds next greater or equals since iteration is in ascending order
func (n *staticAsc) NextGeq(nextID LID) LID {
if n.ptr >= len(n.data) {
return NewLIDOrderDesc(math.MaxUint32)
Member

nit: Please do not forget to use node.NullLID

from := n.ptr
idx, found := util.GallopSearchGeq(n.data[from:], nextID.Unpack())
if !found {
return NewLIDOrderDesc(math.MaxUint32)
Member

nit: Please do not forget to use node.NullLID

// NextGeq finds next less or equals since iteration is in descending order
func (n *staticDesc) NextGeq(nextID LID) LID {
if n.ptr < 0 {
return NewLIDOrderAsc(0)
Member

nit: Please do not forget to use node.NullLID

}
idx, found := util.GallopSearchLeq(n.data[:n.ptr+1], nextID.Unpack())
if !found {
return NewLIDOrderAsc(0)
Member

nit: Please do not forget to use node.NullLID

for {
for len(it.lids) == 0 {
if !it.tryNextBlock {
return node.NewLIDOrderDesc(math.MaxUint32)
Contributor

nit: NullLID()

continue
}

idx, found := util.GallopSearchGeq(it.lids, nextID.Unpack())
Contributor

nit: Galloping search is only more efficient when we expect the element to be located at the very beginning of the list. Otherwise, pure binary search is better and simpler.

Can we benchmark both algorithms on real data?


// BenchmarkOrTreeNextGeq checks the performance of NextGeq vs Next when no skipping occurs and all nodes
// yield distinct values (no intersection between nodes)
func BenchmarkOrTreeNextGeq(b *testing.B) {
Contributor

nit: just want to clarify for myself. Do I understand correctly that we don't expect any acceleration for trees that consist only of Or nodes, so this benchmark is there to make sure there's no degradation?

// Fast path: if we at least left or right and there is nothing to skip, then choose lowest and return.
minID := Min(n.leftID, n.rightID)
if nextID.LessOrEq(minID) {
if n.leftID.Less(n.rightID) {
Contributor

nit: Can we just call return n.Next() here?

    // Fast path: if we at least left or right and there is nothing to skip, then choose lowest and return.
	if nextID.LessOrEq(Min(n.leftID, n.rightID)) {
		return n.Next()
	}

}

func (n *nodeAnd) NextGeq(nextID LID) LID {
for !n.leftID.IsNull() && !n.rightID.IsNull() && !n.leftID.Eq(n.rightID) {
Contributor

It seems there might be incorrect behavior here.

Imagine this is the first call of NextGEQ and the left and right IDs are equal (but less than nextID). What will the method return then?
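The scenario can be reproduced with a small model. This is only a sketch on plain sorted slices with hypothetical `cursor` and `andNode` types, not the PR's `nodeAnd`; it shows one way to avoid returning a stale match: seek both sides to the target before comparing them.

```go
package main

import (
	"fmt"
	"sort"
)

// cursor is a hypothetical ascending iterator over a sorted slice.
type cursor struct {
	data []uint32
	pos  int
}

// cur returns the current value, or ok=false when the cursor is exhausted.
func (c *cursor) cur() (uint32, bool) {
	if c.pos >= len(c.data) {
		return 0, false
	}
	return c.data[c.pos], true
}

// seek advances the cursor to the first value >= target (never moves backwards).
func (c *cursor) seek(target uint32) {
	base := c.pos
	c.pos = base + sort.Search(len(c.data)-base, func(i int) bool { return c.data[base+i] >= target })
}

// andNode intersects two cursors.
type andNode struct {
	left, right cursor
}

// NextGeq returns the next common value that is >= target.
func (n *andNode) NextGeq(target uint32) (uint32, bool) {
	// Seek first: the two sides may currently agree on a value below target.
	n.left.seek(target)
	n.right.seek(target)
	for {
		l, lok := n.left.cur()
		r, rok := n.right.cur()
		if !lok || !rok {
			return 0, false
		}
		switch {
		case l == r:
			n.left.pos++
			n.right.pos++
			return l, true
		case l < r:
			n.left.seek(r)
		default:
			n.right.seek(l)
		}
	}
}

func main() {
	n := &andNode{
		left:  cursor{data: []uint32{1, 5, 9}},
		right: cursor{data: []uint32{1, 5, 9, 12}},
	}
	// Both sides start at 1, which is below the requested 4.
	fmt.Println(n.NextGeq(4))
}
```

If the equality check ran before the seek, the first call would hand back 1 even though the caller asked for a value >= 4.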


node := NewAnd(left, right)

// Currently, nodes instantiate their state on creation, which will be fixed later.
Contributor

I think this behavior doesn't match the expectations for the method

return NullLID()
}

func (n *nodeNAnd) NextGeq(nextID LID) LID {
Contributor

If this is a temporary stub for the actual implementation, wouldn't it be better to do something like this?

func (n *nodeNAnd) NextGeq(nextID LID) LID {
	lid := n.Next()
	for lid.Less(nextID) {
		lid = n.Next()
	}
	return lid
}

Base automatically changed from 0-fast-cmp-xor to main March 18, 2026 16:33
# Conflicts:
#	frac/processor/aggregator.go
#	node/bench_test.go
#	node/node_and.go

Labels

performance Features or improvements that positively affect seq-db performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support NextGEQ for inverted index (skipping)

5 participants